Abstract

Interest in the analysis of networks has grown rapidly in the new millennium. Consequently, we
promote renewed attention to a certain methodological approach introduced in 1974. Over the
succeeding decade, this two-stage (double-standardization and hierarchical clustering
[single-linkage-like]) procedure was applied to a wide variety of weighted, directed networks
of a socioeconomic nature, frequently revealing the presence of "hubs". These were, typically
(in the numerous instances studied of migration flows between geographic subdivisions within
nations), "cosmopolitan/non-provincial" areas, a prototypical example being the French
capital, Paris. Such locations emit and absorb people broadly across their respective nations.
Additionally, the two-stage procedure (which "might very well be the most successful
application of cluster analysis" [R. C. Dubes, 1985]) detected many (physically or socially)
isolated, functional groups (regions) of areas, such as the southern islands, Shikoku and
Kyushu, of Japan, the Italian islands of Sardinia and Sicily, and the New England region of
the United States. Further, we discuss a (complementary) approach developed in 1976, in which
the max-flow/min-cut theorem was applied to raw/non-standardized (interindustry, as well as
migration) flows.

PACS numbers: 02.10.Ox, 02.50.-r, 89.65.Cd, 89.65.Gh, 89.75.Hc

I. INTRODUCTION



A. L. Barabasi, in his recent popular book, "Linked", asserts that the emergence of hubs
in networks is a surprising phenomenon that is "forbidden by both the Erdos-Renyi and
Watts-Strogatz models" [1, p. 63] [2, Chap. 8]. Here, we indicate an analytical framework
introduced in 1974 that the distinguished computer scientist R. C. Dubes, in a review of the
compilation of multitudinous results [3], asserted "might very well be the most successful
application of cluster analysis" [4, p. 142]. This two-stage methodology has proved insightful
in revealing, among other interesting relationships, hub-like structures in networks of
(weighted, directed) internodal flows. This approach, together with its many diverse
socioeconomic applications, was documented in a large number of journal articles, as well as
in research institute monographs.

Though the principal procedure to be detailed here is applicable in a wide variety of
social-science settings [3], [4], it has been primarily used, in a demographic context, to study
the internal migration tables published at regular periodic intervals by most of the nations
of the world. These tables can be thought of as N x N (square) matrices, the entries ($m_{ij}$)
of which are the number of people who lived in geographic subdivision i at time t and j at
time t + 1. (Some tables, but not all, have diagonal entries, $m_{ii}$, which may represent either
the number of people who did move within area i, or simply those who lived in i both at
t and t + 1. It can sometimes be of interest to compare analyses with zero and nonzero
diagonal entries [23].)



II. TWO-STAGE METHODOLOGY 



A. First Step: Double-Standardization 



In the first step (iterative proportional fitting procedure [IPFP] [39]) of the methodology
under discussion here, the rows and columns of the table of flows are alternately
(biproportionally [40]) scaled to sum to a fixed number (say 1). Under broad conditions, to be
discussed below, convergence occurs to a "doubly-stochastic" (bistochastic) table, with row
and column sums all simultaneously equal to 1 [41], [42], [43], [44], [45], [46]. The purpose of
the scaling is to remove overall (marginal) effects of size, and focus on relative, interaction
effects. Nevertheless, the cross-product ratios (relative odds), $m_{ij} m_{kl} / (m_{il} m_{kj})$,
measures of association, are left invariant. Additionally, the entries of the doubly-stochastic
table provide maximum entropy estimates of the original flows, given the row and column
constraints [43], [48].
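
The alternating row and column scaling described above can be sketched in a few lines of Python. This is only a minimal illustration, not the FORTRAN or SAS code cited later: the function names `double_standardize` and `cross_product_ratio`, the tolerance, and the small 3 x 3 table (which includes diagonal "stayer" entries) are all hypothetical choices made for the example.

```python
def double_standardize(flows, tol=1e-10, max_iter=10_000):
    """Alternately rescale rows and columns of a nonnegative square table
    until every row sum and column sum is (approximately) 1.

    Assumes no row or column is entirely zero."""
    table = [[float(x) for x in row] for row in flows]
    n = len(table)
    for _ in range(max_iter):
        # Scale each row to sum to 1.
        for i in range(n):
            row_sum = sum(table[i])
            table[i] = [x / row_sum for x in table[i]]
        # Scale each column to sum to 1.
        col_sums = [sum(table[i][j] for i in range(n)) for j in range(n)]
        for i in range(n):
            table[i] = [table[i][j] / col_sums[j] for j in range(n)]
        # Columns now sum to 1 exactly; stop once the rows do as well.
        if max(abs(sum(row) - 1.0) for row in table) < tol:
            return table
    raise RuntimeError("no convergence (the table may be critically sparse)")


def cross_product_ratio(m, i, j, k, l):
    """Relative odds m_ij * m_kl / (m_il * m_kj)."""
    return (m[i][j] * m[k][l]) / (m[i][l] * m[k][j])


# Hypothetical flow table (rows = origins, columns = destinations).
flows = [[900, 120, 30],
         [200, 700, 50],
         [40, 10, 300]]
ds = double_standardize(flows)
before = cross_product_ratio(flows, 0, 1, 2, 2)
after = cross_product_ratio(ds, 0, 1, 2, 2)
```

Because each step multiplies a whole row or a whole column by a constant, the multipliers cancel in the cross-product ratio, so `before` and `after` agree up to rounding.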

For large sparse flow tables, only the nonzero entries, together with their row and column
coordinates, are needed. Row and column (biproportional) multipliers can be iteratively
computed by sequentially accessing the nonzero cells [49]. If the table is "critically sparse",
various convergence difficulties may occur. Nonzero entries that are "unsupported" (that is,
not part of a set of N nonzero entries, no two in the same row or column) may converge
to zero and/or the biproportional multipliers may not converge [3, p. 19] [50] [51, p. 171].



The "first strongly polynomial-time algorithm for matrix scaling" was reported in [52].
The scaling was successfully implemented, in our largest analysis, with a 3,140 x 3,140
1965-70 intercounty migration table (having 94.5% of its entries zero) for the United States
[9], [23], as well as for a more aggregate 510 x 510 table (with State Economic Areas as
the basic unit) for the US for the same period [14]. (Smoothing procedures could be used
to modify the zero-nonzero structure of a flow table, particularly if it is critically sparse
[53], [54]. If one takes the second power of a doubly-stochastic matrix, one obtains another
such matrix, but smoother in character. One might also consider standardizing the ith row
[column] sum to be proportional to the number of non-zero entries in the ith row [column].)

B. Second Step: Strong Component Hierarchical Clustering 



In the second step of the two-stage procedure, the doubly-stochastic matrix is converted 
to a series of directed (0,1) graphs (digraphs), by applying thresholds to its entries. As 
the thresholds are progressively lowered, larger and larger strong components (a directed 
path existing from any member of a component to any other) of the resulting graphs are 
found. This process (a simple variant of well-known single-linkage [nearest-neighbor or min]
clustering [55]) can be represented by the familiar dendrogram or tree diagram used in
hierarchical cluster analysis and cladistics/phylogeny (cf. [56], [57]).
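
The thresholding step can be illustrated with a deliberately naive Python sketch. It recomputes strong components from scratch at every threshold using plain reachability searches, so it is far slower than the Tarjan algorithms discussed in the next subsection and is not the cited implementation. The function names and the 4 x 4 doubly-stochastic table are hypothetical.

```python
from itertools import groupby


def strong_components(n, edges):
    """Strong components of a digraph on nodes 0..n-1, as a set of frozensets."""
    succ = [[] for _ in range(n)]
    pred = [[] for _ in range(n)]
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)

    def reachable(start, adj):
        seen, stack = {start}, [start]
        while stack:
            for w in adj[stack.pop()]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    # Nodes u and v share a component iff each can reach the other.
    return {frozenset(reachable(u, succ) & reachable(u, pred)) for u in range(n)}


def strong_component_hierarchy(table):
    """Lower the threshold through the distinct entries of the table and
    record (threshold, members) each time a new strong component appears."""
    n = len(table)
    links = sorted(((table[i][j], i, j) for i in range(n) for j in range(n)
                    if i != j and table[i][j] > 0), reverse=True)
    known = {frozenset([i]) for i in range(n)}
    edges, merges = [], []
    for value, tied in groupby(links, key=lambda link: link[0]):
        edges.extend((i, j) for _, i, j in tied)  # all entries >= threshold
        for comp in sorted(strong_components(n, edges), key=sorted):
            if comp not in known:
                known.add(comp)
                merges.append((value, sorted(comp)))
    return merges


# Hypothetical zero-diagonal doubly-stochastic table for four areas.
table = [[0.0, 0.7, 0.2, 0.1],
         [0.6, 0.0, 0.1, 0.3],
         [0.3, 0.1, 0.0, 0.6],
         [0.1, 0.2, 0.7, 0.0]]
merges = strong_component_hierarchy(table)
```

In this toy table, areas 0 and 1 pair at threshold 0.6, as do areas 2 and 3, and all four unite at 0.3. A pair forms only when both directed links are present, which is what distinguishes strong components from ordinary single-linkage on the larger of the two flows.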



C. Computer implementations 



A FORTRAN implementation of the two-stage process was given in [58], as well as a
realization in the SAS (Statistical Analysis System) framework [59]. Subsequently, the noted
computer scientist R. E. Tarjan [60] devised an $O(M (\log N)^2)$ algorithm [61] and, then, a
further improved $O(M \log N)$ method [62], where N is the number of nodes and M the
number of edges of a directed graph. (These substantially improved upon the earlier works
[58], [59], which required the computations of transitive closures of graphs, and were $O(MN)$
in nature.) A FORTRAN coding, involving linked lists, of the improved Tarjan algorithm
[62] was presented in [63], and applied in the aforementioned US intercounty study [23].
If the graph-theoretic (0,1)-structure of a network under study is not strongly connected,
independent two-stage analyses of the subsystems of the network would be appropriate.

D. Goodness-of-fit



The goodness-of-fit of the dendrogram generated to the doubly-stochastic table itself can
be evaluated, and possibly employed, it would seem, as an optimization criterion (cf. [65, p.
210] [66, sec. 3]). In the context, not of the weighted, directed networks under discussion
here, but of (0,1)-networks or simply graphs, Clauset, Moore and Newman have written:
"[t]he method known as hierarchical clustering groups vertices in networks by aggregating
them iteratively in a hierarchical fashion. However, it is not clear that the hierarchical
structures produced by these and other popular methods are unbiased, as is also the case
for the hierarchical clustering algorithms of machine learning. That is, it is not clear to
what degree these structures reflect the true structure of the network, and to what degree
they are artifacts of the algorithm itself. This conflation of intrinsic network properties with
features of the algorithms used to infer them is unfortunate ... we give a precise definition of
hierarchical structure, give a generic model for generating arbitrary hierarchical structure in
a random graph, and describe a statistically principled way to learn the set of hierarchical
features that most plausibly explain a particular real-world network" [66].

Distances between nodes in the dendrogram satisfy the (stronger than triangular)
ultrametric inequality, $d_{ij} \le \max(d_{ik}, d_{jk})$ [67, p. 245] [68, eq. (2.2)].



III. EMPIRICAL RESULTS 

A. Cosmopolitan or Hub-Like Units 

1. Internal migration flows 

Geographic subdivisions (or groups of subdivisions) that enter into the bulk of the
dendrogram at the weakest levels are those with the broadest ties. Typically, these have been
found to be "cosmopolitan", hub-like areas, a prototypical example being the French capital,
Paris [3, sec. 4.1] [6]. Similarly, in parallel analyses of other internal migration tables, the
cosmopolitan/non-provincial natures of London [69], Barcelona [16] [3, sec. 6.2, Figs. 36,
37], Milan [3, sec. 6.3, Figs. 39, 40] (cf. [13]), Amsterdam [3, p. 78] [25], West Berlin
[3, p. 80], Moscow (the city and the oblast as a unit) [19] [3, sec. 5.1 and Figs. 6, 7], Manila
(coupled with suburban Rizal) [70], Bucharest [18], Ile-de-Montreal [3, p. 87], Zurich,
Santiago, Tunis and Istanbul [71] were, among others, highlighted in the respective dendrograms
for their nations [3, sec. 8.2] [15, pp. 181-182] [8, p. 55]. In the intercounty analysis for the
US, the most cosmopolitan entities were: (1) the centrally located paired Illinois counties
of Cook (Chicago) and neighboring, suburban Du Page; (2) the nation's capital, Washington,
D.C.; and (3) the paired south Florida (retirement) counties of Dade (Miami) and
Broward (Ft. Lauderdale) [9], [23], [72]. In general, counties with large military installations,
large college populations or state capitals also interacted broadly with other areas [23, p.
53]. Application of the two-stage methodology to 1965-66 London inter-borough migration
[25] indicated that the three inner boroughs of Kensington and Chelsea, Westminster, and
Hammersmith acted, as a unit, in a cosmopolitan manner [3, sec. 5.2, Fig. 10]. (In sec. 8.2
and Table 16 of the anthology of results [3], additional geographic units and groups of units
found to be cosmopolitan with regard to migration are enumerated.)

It should be emphasized that although the indicated cosmopolitan areas may generally
have relatively large populations, this cannot, in and of itself, explain the wide national ties
observed, since the double-standardization, in effect, renders all areas of equal overall size.
(However, to the extent that larger areas do have fewer zero entries in their corresponding
rows and columns, a bias to cosmopolitanism may in fact be present, which should be carefully
considered. Possible corrections for bias were discussed above in sec. II A.) If one were to
obtain a (zero-diagonal) doubly-stochastic matrix, all the off-diagonal entries of which were
simply $1/(N-1)$, it would indicate complete indifference among migrants as to where they
come from and to where they go. A maximally cosmopolitan unit would be one for which all the
corresponding row and column entries were $1/(N-1)$ (if all the diagonal entries, $m_{ii}$, are a
priori zero). (It seems interesting to note that cosmopolitan areas appear to have a certain
minimax character, that is, the maximum doubly-stochastic entry for the corresponding row and
column tends to be minimized.)
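
The minimax remark can be made concrete with a small Python sketch. The ranking below is only an illustrative index assumed for this example, not a statistic used in the cited studies (where cosmopolitanism is read off the dendrogram); the function name `minimax_ranking` and the 4 x 4 table are hypothetical.

```python
def minimax_ranking(table):
    """Rank units by the largest off-diagonal entry in their row and column.

    A smaller peak means the unit's ties are spread more evenly, i.e. it looks
    more 'cosmopolitan' in the minimax sense. Returns (unit, peak) pairs,
    most cosmopolitan first."""
    n = len(table)
    peaks = []
    for u in range(n):
        row = [table[u][j] for j in range(n) if j != u]
        col = [table[i][u] for i in range(n) if i != u]
        peaks.append((u, max(row + col)))
    return sorted(peaks, key=lambda pair: pair[1])


# Hypothetical zero-diagonal doubly-stochastic table: unit 0 sends and
# receives 1/(N-1) = 1/3 everywhere, while units 1-3 have one dominant tie.
third, sixth = 1 / 3, 1 / 6
table = [[0.0, third, third, third],
         [third, 0.0, 0.5, sixth],
         [third, sixth, 0.0, 0.5],
         [third, 0.5, sixth, 0.0]]
ranking = minimax_ranking(table)
```

Here unit 0 attains the smallest possible peak, $1/(N-1)$, and so heads the ranking, matching the description of a maximally cosmopolitan unit.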

2. Trade and interindustry flows 

The nation of Italy possessed the broadest ties in a two-stage analysis of the value of
1974 trade between 113 nations, followed by a closely-bound group composed of the four
Scandinavian countries [3, sec. 5.6, Fig. 22]. In a two-stage study (but using weak
rather than strong components of the associated digraphs) of the 1967 US interindustry
transaction table, the industry with the broadest (most diffuse) ties was found to be Other
Fabricated Metal Products [10], [73] [24, pp. 13-18].

3. Journal citations 

In a two-stage analysis of 22 mathematical journals, the Annals of Mathematics and
Inventiones Mathematicae were strongly paired, while the Proceedings of the American
Mathematical Society was found to possess the broadest, most diffuse ties [8].

In a recent, large-scale (N > 6000) journal-to-journal citation analysis, decomposing "the
network into modules by compressing a description of the probability flow", Rosvall and
Bergstrom preliminarily omitted from their analysis the prominent journals Science, Nature
and Proceedings of the National Academy of Sciences [74, p. 1123]. (Those are precisely
the ones that would be expected to be "cosmopolitan" or hub-like in character, and to be
highlighted in a corresponding two-stage analysis.) Their rationale for the omission was that
"the broad scope of these journals otherwise creates an illusion of tighter connections among
disciplines, when in fact few readers of the physics articles in Science also are close readers
of the biomedical articles therein". (In [24, pp. 125-153], we reported the results of a partial
hierarchical clustering, not a two-stage analysis, but one originally designed and conducted
by Henry G. Small and William Shaw, of citations between more than 3,000 journals. The
clusters obtained there were compared with the actual subject matter classification employed
by the Institute for Scientific Information.)

B. Functional Clusters of Units 

1. Internal migration regions 

Geographically isolated (insular) areas, such as the Japanese islands of Kyushu and
Shikoku [5], emerged as well-defined clusters (regions) of their constituent (seven and four,
respectively) subdivisions ("prefectures" in the Japanese case) in the dendrograms for the
two-stage analyses, and similarly the Italian islands of Sicily and Sardinia [12], the North
and South Islands of New Zealand [3], and the Canadian islands of Newfoundland and Prince
Edward Island [3, p. 90] (cf. [75], [76]). The eight counties of Connecticut, and other New
England groupings, as further examples, were also very prominent in the highly disaggregated
US analysis [23]. Relatedly, in a study based solely upon the 1968 movement of college
students among the fifty states, the six New England states were strongly clustered [11, Fig.
1]. Employing a 1963 Spanish interprovincial migration table, well-defined regions were
formed by the two provinces of the Canary Islands, and the four provinces of Galicia [16]
[3, sec. 6.2.1, Fig. 37]. The southernmost Indian states of Kerala and Madras (now Tamil
Nadu) were strongly paired on the basis of 1961 interstate flows [22]. A detailed comparison
between functional migration regions found by the two-stage procedure and those actually
employed for administrative, political purposes in the corresponding nations is given in sec.
8.1 and Table 15 of [3].

It should be noted that it is rare that the two-stage methodology yields a migration region
composed of two or more noncontiguous subregions, even though no contiguity information
at all is present in the flow table nor provided to the algorithm (cf. [54], [77]). A notable
exception to this rule was the uniting of the northern Italian region of Piemonte (the location
of industrial Turin, where Fiat is based) with southern regions, before joining with central
regions, in an 18-region 1955-70 study [13] [3, p. 75] (cf. [12]).



2. Intermarriage and interindustry clusters 



In a two-stage analysis of a 32 x 32 table of birthplace of bridegroom versus birthplace
of bride of 1947 Australian intermarriages [78], Greece and Cyprus were the strongest dyad
[3, sec. 5.7, Fig. 25].

In the 1967 US interindustry two-stage (weak component) analysis, two particularly
salient pairs of functionally-linked industries were: (1) Stone and Clay Products, and Stone
and Clay Mining and Quarrying; and (2) Household Appliances and Service Industry
Machines (the latter industry purchases laundry equipment, refrigerators and freezers from the
former) [10] [73] [24, pp. 13-18].



IV. STATISTICAL ASPECTS 



It would be of interest to develop a theory, making use of the rich mathematical structure
of doubly-stochastic matrices, by which the statistical significance of apparent hubs and
clusters in dendrograms produced by the two-stage procedure could be evaluated [23, pp.
7-8] [79]. In the geographic context of internal migration tables, where nearby areas have
a strong distance-adverse predilection for binding, it seems unlikely that most clustering
results generated could be considered to be, in any standard sense, "random" in nature. On
the other hand, other types of "origin-destination" tables, such as those for occupational
mobility [80], journal citations [24, pp. 125-153], interindustry (input-output) flows [10],
[73], brand-switches [3, sec. 9.6] [81], crime-switches [3, sec. 9.7] [82, Table XII], and (Morse
code) confusions [3, sec. 9.8], among others, clearly lack such a geographic dimension
(cf. [84]). An efficient algorithm, considered as a nonlinear dynamical system, to generate
random bistochastic matrices has recently been presented [43] (cf. [85]).



In the US 3,140-county migration study, a statistical test of Ling [87] (designed for
undirected graphs), based on the difference in the ranks of two edges, was employed in a heuristic
manner [23, pp. 7-8]. For example, the 3,148th largest doubly-stochastic value, 0.12972
(corresponding to the flow from Maui County to Hawaii County), united the four counties of
the state of Hawaii. The (considerably weaker) 7,939th largest value, 0.07340 (the link from
Kauai County, Hawaii, to Nome, Alaska), integrated the four-county state of Hawaii into a
much larger 2,464-county cluster. The difference of these two ranks, 4,791 = 7,939 - 3,148,
is the isolation index or "survival time" of this state as a cluster. Reference to Table 1 in
[23] showed the significance of the state of Hawaii as a functional internal migration unit
at the 0.01 level [23, p. 7]. (In the computation of this table, the approximation was used
that the number of edges in the relevant digraphs was a negligible proportion of all possible
3,140 x 3,139 edges.)



Also, the possibility of employing the asymptotic theory of random digraphs [88], [89] for
statistical testing purposes was raised in [23]. In this regard, it was necessary to consider the
38,815th largest entry of the doubly-stochastic matrix to complete the hierarchical clustering
of the 3,140 counties. The probability is 0.973469 that a random digraph with 3,140 nodes
and 38,814 links is strongly connected [89, p. 361], where $0.973469 = e^{-2e^{-4.30917}}$, and
38,814 = 3140(log 3140 + 4.30917). Evidence of systematic structure in the migration flows
can, thus, be adduced, since the digraph based on the 38,814 greatest-valued links was not
strongly connected [23, p. 8] (cf. [90]).
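
The two figures quoted in the preceding paragraph can be checked directly. The short Python sketch below assumes that "log" denotes the natural logarithm and that the constant 4.30917 enters both expressions as stated; the variable names are chosen only for this illustration.

```python
import math

n_nodes = 3140     # counties
c = 4.30917        # constant quoted in the text

# Number of links: n (log n + c), with log taken as the natural logarithm.
links = n_nodes * (math.log(n_nodes) + c)

# Asymptotic probability of strong connectivity: exp(-2 exp(-c)).
prob_strongly_connected = math.exp(-2.0 * math.exp(-c))

print(round(links), round(prob_strongly_connected, 6))
```

Under these assumptions the link count rounds to 38,814 and the probability to 0.973469, in agreement with the values in the text.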

In a random digraph with a large number of nodes, the probability is close to one that 
all nodes are either isolated or lie in a single ("giant") strong component. The existence of
intermediate-sized clusters is thus evidence of non-randomness, even if such groups are not
themselves significant according to the isolation (difference-of-ranks) criterion of Ling [87].
With randomly-generated data and many taxonomic units, one would expect the two-stage
procedure to yield a dendrogram exhibiting complete chaining. So, although single-linkage
clustering is often criticized for producing chaining, chains can also be viewed simply as 
indications of inherent randomness in the data. In contrast to single-linkage clustering, 
strong component hierarchical clustering can merge more than two clusters (children) into 
one (parent) node. This serves to explain why fewer clusters (2,245) were generated in the 
intercounty migration study than the 3,139 that single-linkage (in the absence of ties) would 
produce. 

A. A cluster-analytic isolation criterion 

Dubes and Jain [91] provided "a semi-tutorial review of the state-of-the-art in cluster
validity, or the verification of results from clustering algorithms". Among other evaluative
standards, they discussed isolation criteria, which "measure the distinctiveness or separation
or gaps between a cluster and its environment". Such a statistic was developed and applied
in [92] in order to extract a small proportion of 5,385 clusters (3,140 of them single units,
673 pairs, 230 triples, 104 quartets, ...) for detailed examination based on the two-stage
analysis of the 1965-1970 United States intercounty migration table [23].

The largest value of the isolation criterion, for all clusters of fewer than 2940 units, was 
attained by a region formed by the eight constituent counties of the state of Connecticut. 
(Groups formed by the application of the two-stage procedure to interareal migration data
are, as a strong rule, composed of contiguous areas [3], [15]. This occurs even in the absence
of contiguity constraints, reflecting the distance decay of migration.) The 11,080th largest
doubly-standardized entry, 5,666, corresponding to movement from New Haven to (New
York City suburban) Fairfield, unified these eight counties (all row and column sums had
been adjusted to 100,000). Not until the 16,047th largest doubly-standardized value, 4,085
(the functional linkage from Litchfield, Connecticut to Berkshire, Massachusetts), viewing 
the clustering procedure as an agglomerative one, was Connecticut absorbed into a larger 
region. The isolation criterion for Connecticut is set equal to

$$25.3175 = -\log \left[ \left( \frac{8 \times 7 + 3132 \times 3131}{3140 \times 3139} \right)^{(16047 - 11080)} \right] \qquad (1)$$



The term in large parentheses is the proportion of cells in the 3,140 x 3,140 table
associated with either movement within Connecticut or within the set of 3,132 complementary
counties (since intracounty flows are not available, a diagonal correction is made). This
term, raised to the power shown, is the probability (unadjusted for occupied cells) that
none of 4967 = 16047 - 11080 consecutive doubly-standardized values would correspond
to movement between Connecticut and its complement. Such a Connecticut-complement
linkage could possibly result in a merger: an unobserved phenomenon. (For further details,
including maps, discussion and extensive applications of the isolation criterion developed to
the U.S. intercounty analysis, see [92].) This isolation score for the cluster formed by the
four counties of Hawaii, discussed above, was 12.21, while the District of Columbia had the
highest score, 23.81, for any single county [92, Table I].
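
As a check on the arithmetic, the isolation score can be computed from the cluster size and the two ranks alone. The Python sketch below assumes the natural logarithm and the sign convention that makes the score positive; the function name `isolation_score` and its parameters are hypothetical. The Hawaii call uses the ranks 3,148 and 7,939 quoted earlier in this section.

```python
import math


def isolation_score(k, n, rank_formed, rank_absorbed):
    """Negative log-probability that none of the intervening ranked entries
    links a k-unit cluster with its (n - k)-unit complement.

    Diagonal cells are excluded, since intracounty flows are unavailable."""
    within = k * (k - 1) + (n - k) * (n - k - 1)   # cells inside either side
    total = n * (n - 1)                            # all off-diagonal cells
    span = rank_absorbed - rank_formed             # "survival time" in ranks
    return -span * math.log(within / total)


connecticut = isolation_score(8, 3140, 11080, 16047)
hawaii = isolation_score(4, 3140, 3148, 7939)
print(round(connecticut, 4), round(hawaii, 2))
```

With these inputs the Connecticut score reproduces the 25.3175 of eq. (1), and the Hawaii score rounds to the 12.21 reported in the text.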

V. COMPLEMENTARY NETWORK FLOW PROCEDURE 

The creative, productive network analyst M. E. J. Newman has written: "Edge weights 
in networks have, with some exceptions . . . received relatively little attention in the physics 
[emphasis added] literature for the excellent reason that in any field one is well advised to
look at the simple cases first (unweighted networks). On the other hand, there are many 
cases where edge weights are known for networks, and to ignore them is to throw out a lot 
of data that, in theory at least, could help us to understand these systems better" [93]. Of 
course, the numerous (mostly, internal migration) applications of the two-stage procedure 
we have discussed above have, in fact, been to such weighted (and directed) networks. 



In [93], Newman applied the famous Ford-Fulkerson max-flow/min-cut theorem [94, Chap.
22] to weighted networks (which he mapped onto unweighted multigraphs). Earlier, this
theorem had been used to study Spanish [76], Philippine [95], and Brazilian, Mexican and
Argentinian [96] internal migration, US interindustry flows [24, pp. 18-28] [97] [73, sec. III]
and the international flow of college students [21] (cf. [98]), all the corresponding flows now
being left unadjusted, that is, not (doubly- nor singly-) standardized.

In this "multiterminal" approach, the maximum flow and the dual minimum edge cut- 
sets, between all ordered pairs of nodes are found. Those cuts (often few or even null in 
number) which partition the N nodes nontrivially-that is, into two sets each of cardinality 
greater than 1-are noted. The set in each such pair with the fewer nodes is regarded as a 
nodal cluster (region, in the geographic context). It has the interesting, defining property 
that fewer people migrate into (from) it, as a whole, than into (from) its node. In the Spanish 
context, the (nodal) province of Badajoz was found to have a particularly large out-migration 
sphere of influence, and the (Basque) province of Vizcaya (site of Bilbao and Guernica), an 
extensive in-migration field [76]. In an analysis of 1967 US interindustry transactions based
on 468 industries, among the industries functioning as nodes of production complexes with
large numbers of members were: Advertising; Blast Furnaces and Steel Mills; Electronic
Components; and Paperboard Containers and Boxes. Conversely, among those serving as
nodes of consumption complexes were Petroleum Refining and Meat Animals [73] [97].
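
The multiterminal idea can be sketched in Python as follows. This is an illustrative implementation using the Edmonds-Karp variant of the Ford-Fulkerson method on a dense capacity matrix, not the code used in the cited studies; the function names and the five-area flow table are hypothetical, and when several minimum cuts exist only the one nearest the source is examined.

```python
from collections import deque


def max_flow_min_cut(capacity, source, sink):
    """Edmonds-Karp: return (maximum flow, source side of a minimum cut)."""
    n = len(capacity)
    residual = [row[:] for row in capacity]
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in range(n):
                if v not in parent and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            # Nodes still reachable from the source form one side of a min cut.
            return flow, set(parent)
        # Find the bottleneck on the path, then push that much flow along it.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            residual[parent[v]][v] -= bottleneck
            residual[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck


def nontrivial_cuts(capacity):
    """For all ordered pairs, keep min cuts whose two sides both have more
    than one node; report (cut value, smaller side) for each."""
    n = len(capacity)
    found = set()
    for s in range(n):
        for t in range(n):
            if s != t:
                value, side = max_flow_min_cut(capacity, s, t)
                other = set(range(n)) - side
                if len(side) > 1 and len(other) > 1:
                    found.add((value, frozenset(min(side, other, key=len))))
    return found


# Hypothetical raw (unstandardized) flows: areas 0-1 and areas 2-4 exchange
# heavily among themselves, with only weak flows between the two groups.
flows = [[0, 100, 1, 1, 1],
         [100, 0, 1, 1, 1],
         [1, 1, 0, 100, 100],
         [1, 1, 100, 0, 100],
         [1, 1, 100, 100, 0]]
clusters = nontrivial_cuts(flows)
```

In this toy table the only nontrivial cut separates areas 0 and 1 from the rest, with a total crossing flow of 6, far less than the 103 leaving (or entering) either area on its own, which is the defining property of a nodal cluster described above.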



VI. CONCLUDING REMARKS 



The networks formed by the World Wide Web and the Internet have been the focus of
much recent interest [1]. Their structures are typically represented by N x N adjacency
matrices, the entries of which are simply 0 or 1, rather than nonnegative numbers, as in
internal migration and other flow tables. One might investigate whether the two-stage
double-standardization and hierarchical clustering, and the (complementary) multiterminal
max-flow/min-cut procedures we have sought to bring to the attention of the active body of
contemporary network theorists, could yield novel insights into these and other important
modern structures.

Though quite successful, evidently, in simultaneously revealing both hub-like and
clustering behavior in recorded flows, the indicated implementations of the two-stage procedure
did not address the recently-emerging, theoretically-important issues of scale-free networks,
power-law descriptions, network evolution and vulnerability, and small-world properties,
among others, that have been stressed by Barabasi [1] (and his colleagues and many others
in the growing field [99]). (For critiques of these matters, see [100], [101].) One might, using
the indicated two-stage procedure, compare the hierarchical structure of geographic areas
using internal migration tables at different levels of geographic aggregation (counties, states,
regions...) (cf. [84]). To again use the example of France, based on a 1962-68 21 x 21
interregional table, Region Parisienne was the most hub-like [3, sec. 4.1] [6], while using a finer
89 x 89 1954-62 interdepartmental table, the dyad composed of Seine (that is, Paris and its
immediate suburbs) together with the encircling Seine-et-Oise (administratively eliminated
in 1964) was most cosmopolitan [7] [3, sec. 6.1]. (In [84], "two distinct approaches to
assessing the effect of geographic scale on spatial interactions" were developed.)

VII. AFTERWORD 

It might be of interest to describe the immediate motivation for this particular commu- 
nication. I had done no further work applying the methods described above after 1986, 
being aware of, but not absorbed in recent developments in network analysis. In May, 2008, 
Mathematical Reviews asked me to review the book of Tom Siegfried [2], chapter 8 of which
is devoted to the on-going activities in network analysis. This further led me (thanks to D.
E. Boyce) to the book of Barabasi [1]. I, then, e-mailed Barabasi, pointing out the use of
the clustering methodologies described above. In reply, he wrote, in part: "I guess you were
another demo of everything being a question of timing - after a quick look it does appear
that many things you did have came back as questions - with much more detailed data -
again in the network community today. No, I was not aware of your papers, unfortunately,
and it is hard to know how to get them back into the flow of the system." The present
communication might be seen as an effort in that direction, alerting present-day investigators to
these demonstratedly fruitful research methodologies, and suggesting possible further
applications and theoretical analysis. (Additionally, we sent Barabasi the two-stage analysis
for a 1972 40 x 40 interdistrict migration table for his native country, Romania, in which the
capital of Bucharest was featured as most cosmopolitan in nature, and the coupled Black
Sea districts of Constanta and Tulcea, as next most. His reply was: "Cool-thanks".)